Main
David Zhang
I am a bioinformatician who focusses on data analysis, visualisation and method development. The goal of my work is to improve the molecular diagnosis rate of patients with rare disorders. For this purpose, I specialise in developing algorithms that integrate large-scale genomic and transcriptomic datasets in order to detect aberrant, disease-causing events.
Education
Research assistant, part-time PhD, Bioinformatics
University College London
London, UK
Present - 2017
- Thesis: Using transcriptomics to improve the diagnosis rate of rare disease patients.
- The goal of my PhD is to develop and apply software that improves genetic diagnosis rates using RNA-sequencing.
MSc, Neuroscience
University College London
London, UK
2016 - 2015
- Thesis: The role of mitochondrial dysfunction in Xeroderma pigmentosum
- Grade: Merit (68%)
- Awarded post-graduate support scheme bursary (£10,000)
BSc, Biomedical Science
University College London
London, UK
2015 - 2012
- Thesis: Investigating the function of CYFIP1 in the development of rat hippocampal neurons.
- Grade: 2:1 (69%)
H.S.
Queen Elizabeth’s School
Barnet, UK
2012 - 2007
- Grades: Maths (A*), Biology (A*), Chemistry (A*), Sociology (A).
Research Experience
Honorary Researcher (2 months)
Johns Hopkins Bloomberg School of Public Health
Remote
2020
- Collaborated with Leonardo Collado-Torres, using the recount3 dataset and LIBD samples to study the effect of complex splicing in individuals with neurological disease.
Research Technician
University College London
London, UK
2017 - 2016
- Used R and bash to investigate the effect of genetic variation on cognition and the age of onset of dementia in individuals with Down syndrome.
Industry Experience
Bioinformatician internship (3 months)
Verge Genomics
Remote
2020
- Detected aberrant splicing events in patients with complex diseases.
- Used AWS infrastructure to analyse hundreds of RNA-seq samples derived from patients with Parkinson’s disease and amyotrophic lateral sclerosis.
Software & programming
Bioconductor packages
N/A
N/A
Present - 2020
Data science blog
N/A
N/A
2021
Data wrangling
Neuroimmunology & CSF Laboratory, NHS
London, UK
2018 - 2016
- Developer and maintainer of data wrangling pipelines that improved the efficiency and standardisation of monthly financial reports.
Teaching Experience
Developing Bioconductor Packages
University College London
Virtual Event
2020
Unit testing using testthat edition 3
rstats club
Virtual Event
2020
- Talk covering unit testing fundamentals, the importance of testing and the new features released in edition 3 of the R package testthat.
R fundamentals
Clinician Coders
London, UK
2020 - 2018
- Developed materials and led workshops that aimed to teach R fundamentals to clinicians.
Selected Publications
Megadepth: efficient coverage quantification for BigWigs and BAMs
Bioinformatics
N/A
2021
- Wilks C, Ahmed O, Baker DN, Zhang D, Collado-Torres L, Langmead B. 2021. Megadepth: efficient coverage quantification for BigWigs and BAMs. Bioinformatics.
- Role: R package developer.
- DOI: https://doi.org/10.1101/2020.12.17.423317
Integration of eQTL and Parkinson’s disease GWAS data implicates 11 disease genes
JAMA Neurology
N/A
2021
- Kia DA, Zhang D, Guelfi S, Manzoni C, Hubbard L, United Kingdom Brain Expression Consortium (UKBEC), International Parkinson’s Disease Genomics Consortium (IPDGC), Reynolds RH, Botía JA, Ryten M, Ferrari R, Lewis PA, Williams N, Trabzuni D, Hardy J, Wood NW. 2021. Integration of eQTL and Parkinson’s disease GWAS data implicates 11 disease genes. JAMA Neurology.
- Role: Co-first author.
- DOI: https://doi.org/10.1001/jamaneurol.2020.5257
Incomplete annotation of disease-associated genes is limiting our understanding of Mendelian and complex neurogenetic disorders
Science Advances
N/A
2020
- Zhang D, Guelfi S, Ruiz SG, Costa B, Reynolds RH, D’Sa K, Liu W, Courtin T, Peterson A, Jaffe AE, Hardy J, Botía JA, Collado-Torres L, Ryten M. 2020. Incomplete annotation of disease-associated genes is limiting our understanding of Mendelian and complex neurogenetic disorders. Science Advances.
- Role: First author.
- DOI: https://doi.org/10.1126/sciadv.aay8299